INTELLIGENT MODELING, TRANSFORMATION AND MANIPULATION SYSTEM

CROSS REFERENCE TO RELATED UNITED STATES PATENT APPLICATION

This patent application is related to United States provisional patent 
application Serial No. 60/168,020 filed on November 30, 1999 entitled 
INTELLIGENT MODELING, TRANSFORMATION AND MANIPULATION (IMTM) 
SYSTEM. 

FIELD OF THE INVENTION 

The present invention relates to a method of intelligent 2D and 3D object 
and scene modeling, transformation and manipulation and more particularly this 
invention relates to the field of computer modeling, virtual reality, animation and 
3D Web streaming. 

BACKGROUND OF THE INVENTION 

Studies of the human vision system show that the analysis of a dynamic
scene involves both low-level processing at the retina and high-level knowledge
processing in the brain, see P. Buser and M. Imbert, Vision, translated by R. H.
Kay, pp. 137-151, The MIT Press, 1992, and Su-Shing Chen, Structure from Motion
without the Rigidity Assumption, Proceedings of the 3rd Workshop on Computer.
For motion analysis, it has been shown that the human vision system captures
both high-level structures and low-level motions of a dynamic scene, see D. Burr
and J. Ross, Visual Analysis during Motion, Vision, Brain, and Cooperative
Computation, pp. 187-207, edited by M. A. Arbib and A. R. Hanson, The MIT
Press, 1987. Unfortunately, current vision and graphics systems do not satisfy
this requirement. Popular representations such as those taught by M. Kass, A.
Witkin and D. Terzopoulos, Snakes: Active Contour Models, International
Journal of Computer Vision, pp. 321-331, Kluwer Academic Publishers, Boston,
1988, and D. N. Lee, The Optical Flow Field: The Foundation of Vision,
Philosophical Transactions of the Royal Society of London, B290, pp. 169-179,
1980, do not enable symbolic or knowledge manipulation. These systems
generally lack the capability of automatically learning unknown types of motion
and movements of modeled objects.

In the fast growing virtual reality community, realistic visual modeling of
virtual objects has never been needed more. Previous modeling
techniques mainly address geometrical appearance or physical features.
However, as pointed out in J. Bates, Deep Structure for Virtual Reality, Technical
Report, CMU-CS-91-133, Carnegie Mellon University, May, 1991, and G. Burdea
and P. Coiffet, Virtual Reality Technology, John Wiley and Sons, Inc., 1994, an
ideal object modeling has at least the following requirements: appearance
modeling for geometric shapes; kinematics modeling for the rotations and
translations of objects; physical modeling of various properties such as the mass,
inertia and deformation factors of objects, to mention just a few; and behavioral
features such as intelligence and emotions.

Similar requirements arise from many Internet applications where there is
a fast growing interest in 3D Web contents. Current 3D Web methodologies
(such as VRML, see D. Brutzman, M. Pesce, G. Bell, A. Dam and S. Abiezzi,
VRML, Prelude and Future, ACM SIGGRAPH '96, pp. 489-490, New Orleans,
August 1996) heavily depend on Internet bandwidth to transmit 3D data.

Therefore, an efficient 3D representation is required to "compress" the complex 
and multi-aspect 3D data. 

To satisfy all these requirements, there is needed a generic structure to
enable symbolic operations for the modeling and manipulation of real and virtual
world data with different types of information. The representation of visual objects
has preoccupied computer vision and graphics researchers for several decades.
Before the emergence of computer animation, research mainly focused on the
modeling of rigid shapes. Despite the large body of work, most techniques
lacked the flexibility to model non-rigid motions. Only after the mid 1980's were a
few modeling methodologies proposed for solving deformation problems.

An effective object modeling methodology should characterize the object's
features under different circumstances in the application scope. In early
research, focus was placed on appearance modeling since the objects involved
in vision and graphics applications were mostly simple and stationary. Currently,
simple unilateral modeling can no longer satisfy the requirements of dynamic
vision and computer animation. As discussed in G. Burdea and P. Coiffet, Virtual
Reality Technology, John Wiley and Sons, Inc., 1994, a complete 3D object
modeling should ideally comprise at least the following components.

1) Geometrical modeling: this is the basic requirement for any vision or graphics
system. It describes an object's geometrical properties, namely, the shape (e.g.,
polygon, triangle or vertex, etc.) and the appearance (e.g., texture, surface
reflection and/or color).

2) Kinematics modeling: this specifies an object's motion behaviors that are vital
for dynamic vision and animation. 4x4 or 3x4 homogeneous transformation
matrices can be used to identify translations, rotations and scaling factors.

3) Physical modeling: physical modeling is required for complex situations where
the object is elastic and/or deformations and collisions are involved. Objects can
be modeled physically by specifying their mass, weight, inertia, compliance,
deformation parameters, etc. These features are integrated with the geometrical
modeling along with certain physical laws to form a realistic model.

4) Behavior modeling: this is the least studied aspect of modeling. In intelligent
modeling, an object can be considered as an intelligent agent that has a degree
of intelligence. It can actively respond to its environments based on its behavioral
rules.
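
By way of illustration only, the kinematics component described in item 2) can be sketched with 4x4 homogeneous matrices. The helper names below (translation, rotation_z, scaling, matmul, apply) are hypothetical and are not part of any cited system; the sketch assumes column-vector points and right-to-left composition.

```python
import math

def translation(tx, ty, tz):
    """4x4 homogeneous matrix that translates by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """4x4 homogeneous matrix that rotates by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def scaling(sx, sy, sz):
    """4x4 homogeneous matrix that scales each axis independently."""
    return [[sx, 0.0, 0.0, 0.0],
            [0.0, sy, 0.0, 0.0],
            [0.0, 0.0, sz, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 4x4 matrices; (a * b) applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(m, point):
    """Apply a 4x4 homogeneous matrix to a 3D point (w is fixed at 1)."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])

# Scale by 2, then rotate 90 degrees about z, then translate by (1, 0, 0).
M = matmul(translation(1.0, 0.0, 0.0),
           matmul(rotation_z(math.pi / 2), scaling(2.0, 2.0, 2.0)))
p = apply(M, (1.0, 0.0, 0.0))
```

A single matrix thus carries translation, rotation and scaling together, which is why one such matrix per object (or per part) suffices as a kinematic attribute.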

Hereinafter, current modeling methodologies are reviewed and classified 
into three categories: continuous modeling, discrete modeling and graph-based 
modeling. 
Continuous Modeling

This type of modeling approximates either the whole or a functional part of
the 3D object by a variation of geometrical primitives, such as blocks,
polyhedrons, spheres, generalized cylinders or superquadrics. These
geometrical primitives can be expressed as continuous or piecewise continuous
functions in 3D space. Kinematic and physical features can be easily combined
with the geometrical shapes. Among the large body of geometrical primitives,
generalized cylinders and superquadrics are the most popular since they can
easily handle deformations. Barr is considered one of the first to borrow
techniques from linear mechanical analysis to approximate visual 3D objects,
see A. Barr, Superquadrics and Angle-preserving Transformations, IEEE
Computer Graphics Applications, 18:21-30, 1981, and A. Barr and A. Witkin,
Topics in Physically Based Modeling, ACM SIGGRAPH '89, Course Note 30,
New York, 1989. He defined the angle-preserving transformations on
superquadrics. Although the original approach is only for computer graphics, it is
also useful in vision tasks with fruitful results. As a dynamic extension of
superquadrics, the deformable superquadrics proposed by D. Terzopoulos and
D. Metaxas, Dynamic 3D models with local and global deformations: Deformable
Superquadrics, IEEE Transactions on PAMI, 13(7):703-714, 1991, is a physical
feature-based approach. It fits complex 3D shapes with a class of dynamic
models that can deform both globally and locally. The model incorporates the
global shape parameters of a conventional superellipsoid with the local degrees
of freedom of a spline. The local/global representation simultaneously satisfies
the requirements of 3D shape reconstruction and 3D recognition. In animation,
the behaviors of the deformable superquadrics are governed by motion
equations based on physics. In 3D model construction, the model is fitted with
3D visual information by transforming the data into forces and simulating the
motion equations through time.

In animation tasks, it is easy to detect, attach and apply geometrical,
kinematic and physical parameters to continuously modeled objects. However, it
is difficult to do so for behavioral features since the model lacks a symbolic
structure as the base to fit in behavioral languages. Furthermore, it is impossible
to form arbitrary real world objects by approximation with pre-defined primitives
such as generalized cylinders or superquadrics.

Discrete Modeling 

A wide variety of computer vision applications involve highly irregular,
unstructured and dynamic scenes. They are characterized by rapid and
non-uniform variations in spatially irregular feature density and physical
properties. It is difficult to model such objects from any of the four aspects
mentioned before with continuous elements. Such difficulty arises from the
unpredictable behaviors of the objects. Discrete modeling is able to approximate
the surfaces or volumes of this kind of object by vast patches of very simple
primitives, such as polygons or tetrahedrons.

Since most graphics applications use polygons as the fundamental
building block for object description, a polygonal mesh representation of curved
surfaces is a natural choice for surface modeling as disclosed in G. Turk,
Re-Tiling Polygonal Surfaces, Computer Graphics (SIGGRAPH '92), 26(2):55-64,
1992. Polygonal approximation of sensory data is relatively simple and sampled
surfaces can be approximated to the desired precision, see M. A. Khan and J. M.
Vance, A Mesh Reduction Approach to Parametric Surface Polygonization, 1995
ASME Design Automation Conference Proceedings, Boston, MA, Sept. 1995.
Physical and kinematic features can be associated more flexibly with either a
single element (a polygon) or a group of elements (a patch of polygons).

The triangular mesh is a special case of the polygonal mesh. It has been
recognized as a powerful tool for surface modeling due to its simplicity,
flexibility and abundance of manipulation algorithms. Triangular meshes are used
in many general vision and graphics applications, which provide fast
preprocessing, data abstraction and mesh refinement techniques, see for example
R. E. Fayek, 3D Surface Modeling Using Topographic Hierarchical Triangular
Meshes, Ph.D. Thesis, Systems Design Eng., University of Waterloo, Waterloo,
Ontario, Canada, April 1996, L. De Floriani and E. Puppo, Constrained Delaunay
Triangulation for Multi-resolution Surface Description, Pattern Recognition, pp.
566-569, 1988, and S. Rippa, Adaptive Approximation by Piecewise Linear
Polynomials on Triangulations of Subsets of Scattered Data, SIAM Journal on
Scientific and Statistical Computing, pp. 1123-1141, 1992.

Based on the triangular mesh, Terzopoulos and Waters successfully
attached physical constraints to the human facial model, as disclosed in D.
Terzopoulos and K. Waters, Analysis and Synthesis of Facial Image Sequences
Using Physical and Anatomical Models, IEEE Transactions on PAMI,
15(6):569-579, 1993. From sequences of facial images, they built mesh models with
anatomical constraints. An impressive advance of the methodology is that it has
the capability to model the behavioral features. With the support of anatomical
data, different emotions can be modeled and applied to arbitrary human faces.

The main drawback of discrete methodologies is that they lack the high level
structure to control the modeling or to perform symbolic operations. Furthermore,
the discrete primitives are unstructured and only contain information about local
features. When applied to dynamical cases, even though the geometrical
modeling can be highly precise, abstracting high level information from the data
is still problematic.

Graph-Based Symbolic Modeling 

In graph-based approaches, a complex object is usually represented
explicitly by a set of primitives and the relations among them in the form of a
graph. If the primitives are regular blocks such as cylinders, cubes or
superquadrics, such a model can be viewed as a clone of continuous modeling. If
the model consists of a vast number of primitives that are discrete polygons or
polygonal meshes, it is an extension of the discrete model.

The graph representation was first put forward in 1970, neither for
computer vision nor for graphics applications, but for the script description of a
scene in artificial intelligence, see A. C. Shaw, Parsing of Graph-Representation
Pictures, Journal of ACM, 17:453-481, 1970. In the early 1980's, Shapiro (L. G.
Shapiro, Matching Three-dimensional Objects using a Relational Paradigm,
Pattern Recognition, 17(4):385-405, 1984) applied relational graphs for object
representation. The quadtree (for surface) and octree (for solid) encoding
algorithms given by Meagher (D. J. Meagher, Octree encoding: a new technique
for the representation, manipulation, and display of arbitrary three-dimensional
objects by computer, Technical Report, IPL-TR-80-111, Image Processing
Laboratory, Rensselaer Polytechnic Institute, Troy, NY, April 1982) can be
viewed as a special case since they form trees (directed graphs) as the
hierarchical structure for representation. In the same period, Wong et al.
introduced the attributed hypergraph representation for 3D objects, see A. K. C.
Wong and S. W. Lu, Representation of 3D Objects by Attributed Hypergraph for
Computer Vision, Proceedings of IEEE S.M.C International Conference, pp.
49-53, 1983. Later, random graphs (A. K. C. Wong and M. You, Entropy and
Distance of Random Graphs with Applications to Structural Pattern Recognition,
IEEE Transactions on PAMI, 7(5):599-609, 1985) and more sophisticated
attributed hypergraphs were presented as geometry and knowledge
representations for general computer vision tasks (A. K. C. Wong and R. Salay,
An Algorithm for Constellation Matching, Proceedings of the 8th International
Conf. on Pattern Recognition, pp. 546-554, Oct. 1986, and A. K. C. Wong, M.
Rioux and S. W. Lu, Recognition and Shape Synthesis of 3-D Objects Based on
Attributed Hypergraphs, IEEE Transactions on PAMI, 11(3):279-290, 1989).

In 3D model synthesis, random graphs were applied to describe the
uncertainties brought by sensors and image processing, see A. K. C. Wong and
B. A. McArthur, Random Graph Representation for 3-D Object Models, SPIE
Milestone Series, MS72:229-238, edited by H. Nasr, in Model-Based Vision,
1991. In Wong et al., an attributed hypergraph model was constructed based on
model features (A. K. C. Wong and W. Liu, Hypergraph Representation for 3-D
Object Model Synthesis and Scene Interpretation, Proceedings of the 2nd
Workshop on Sensor Fusion and Environment Modeling (ICAR), Oxford, U.K.,
1991). The representation had a four-level hierarchy that characterizes: a) the
geometrical model features; b) the characteristic views induced by local
image/model features, each of which contains a subset of the model features
visible from a common viewpoint; c) a set of topologically equivalent classes of
the characteristic views; and d) a set of local image features wherein domain
knowledge could be incorporated into the representation for various forms of
decision making and reasoning.

Since graph-based modeling approaches introduce the concepts of
primitives and their relations, it is straightforward to construct a hierarchical
representation. At lower levels, geometrical, kinematic and physical features can
be encoded as the primitives and their associated attributes. At higher levels,
entities such as edges and hyperedges can be used to represent symbolic
information. In a graph structure, it is convenient to perform symbolic operations.
With the aid of machine intelligence, domain knowledge can be learned and later
recalled and processed together with other data. However, traditional
graph-based methods lack the representational power for dynamic objects.
Although they have the potential to handle deformations/transformations, up to
now they have been applied only to rigid object modeling, due to their rigid
structure and the lack of transformation operators on graphs.

The fast paced developments in computer vision, computer graphics and the
Internet motivate the need for a new modeling methodology using a unified data
structure and manipulation processes directed by machine intelligence.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a generic and unified 
object modeling for computer vision, computer graphics, animation and 3D 
Internet using machine intelligence. 

In one aspect of the invention there is provided a computerized method of
modeling a three dimensional object, comprising:

extracting selected topological features from captured imaging data or
other surface data of a three dimensional object;

constructing a triangular mesh representation of said three dimensional
object from said topological features;

mapping vertices and edges of said triangular mesh to vertices and edges
respectively of a representative attributed graph; and

constructing an attributed hypergraph representation from said
representative attributed graphs.

In another aspect of the invention there is provided a computerized
method of intelligent model transformation, comprising the steps of:

computing an optimal subgraph isomorphism for attributed hypergraph
between a source attributed hypergraph and a target attributed hypergraph,
wherein said source attributed hypergraph is to be transformed into said target
attributed hypergraph;

computing a sequence of transformation operators from said optimal
subgraph isomorphism for attributed hypergraph;

computing a transformation path from said sequence of transformation
operators; and

generating a sequence of attributed hypergraphs along said
transformation path from said sequence of transformation operators.

In another aspect of the invention there is provided a computerized
method of intelligent model augmentation of a real scene with a virtual scene into
an augmented scene, comprising the steps of:

constructing an attributed hypergraph representation of said real scene;

constructing an attributed hypergraph representation of said virtual scene;

computing a sequence of transformation operators between said two
attributed hypergraphs;

integrating said two attributed hypergraph representations into a unified
attributed hypergraph representation using said sequence of transformation
operators;

constructing an augmented scene from said unified attributed hypergraph
representation.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described, by way of example only, reference
being had to the accompanying drawings, in which:

Figure 1 is a block diagram of an integrated computer vision and graphics
system using the intelligent modeling, transformation and manipulation method
according to the present invention;

Figure 2 shows the relationships among geometrical, physical and graph
representations in the view of category theory;

Figure 3 graphically illustrates a functor F = (Fobj, Fmor) that maps category
Gph to category Set;

Figure 4 shows the net-like structure for attributed hypergraph with nodes
and links;

Figure 5 shows the primary operators for attributed hypergraphs: (a)
union and (b) intersection;

Figure 6 shows the primary operators for an attributed hypergraph:
(a) dichotomy; (b) merge; (c) subdivision; (d) join; (e) attribute transition;

Figure 7 shows the geometrical, physical and behavioral attributes and
their relations in AHR;

Figure 8 graphically illustrates a compound operator defined on a simple
graph;

Figure 9 illustrates examples of continuous transformations with
qualitative transition operators on simple graphs: (a) subdivision and (b) join;

Figure 10 graphically illustrates the relationship of spherical coordinates
and Cartesian coordinates for a three dimensional point;

Figure 11 shows the process to construct face representation from
wire-frame model;

Figure 12 illustrates the process of AHR-based augmented reality
construction; and

Figure 13 illustrates the principle of texture inversion from 2D image to 3D
surface.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a new form of attributed hypergraph
representation (AHR), which when coupled with triangular mesh surface
approximation, provides a highly efficient method for both 2D and 3D modeling.
For 3D objects, surface modeling is preferred to solid modeling since it fits both
computer vision and computer graphics applications as common vision sensors
provide surface data only. A net-like data structure is designed to handle the
dynamic changes of the representation corresponding to the object's movements
and deformations so as to overcome the inflexibility of graph structure. Figure 1
gives a block diagram of the integrated intelligent modeling, transformation and
manipulation system based on AHR.

For any scientific study on representation, in particular, for general
object modeling, the fundamental problem is: given a collection of instances of a
certain entity, a set of transformations that can be applied on the entity, and an
isomorphism or equivalence relation with respect to the transformations, find the
general description of the entity and the associated transformations in a
mathematical language. In this invention, categorical formalism is used
to generalize the above problem, since it provides a mathematical language to
address objects, transformations, invariance and equivalence, which is essential
for a general pattern representation and manipulation.

The universality of the categorical language enables formal definitions of
notions and procedures which, in the past, were empirically defined in the
computer vision and computer graphics settings. Many concepts in category
theory can play predominant roles in the formalization of most general
representation problems. It also helps to establish links between apparently very
different notions (such as projections and transformations), or between different
aspects of the same problem (such as geometrical modeling and physical
modeling).

I. Definitions and Notations

[Definition 1] A category C is given if a class of elements, called objects, is
given such that:

1. for each pair of objects $(O_1, O_2)$ in C, a map $u$, called a morphism (denoted
as $O_1 \xrightarrow{u} O_2$), is given;

2. for each object $O$ in C, there is an identity morphism $i_O$, such that, if
$O \xrightarrow{i_O} O'$, then $O' = O$;

3. the composition (denoted as $*$) maps of morphisms satisfy the following
axioms:

1) if $O_1 \xrightarrow{u} O_2 \xrightarrow{v} O_3 \xrightarrow{w} O_4$, then
$O_1 \xrightarrow{v * u} O_3 \xrightarrow{w} O_4$ and
$O_1 \xrightarrow{u} O_2 \xrightarrow{w * v} O_4$ give the same morphism from $O_1$ to $O_4$;

2) identity maps always compose with $O \xrightarrow{u} O'$ to give $i_{O'} * u = u$ and
$u * i_O = u$.

The categories commonly met in mathematics have objects that are structured
sets and morphisms.


[Definition 2] A morphism $u$ from $O$ to $O'$, denoted as $O \xrightarrow{u} O'$, is called a
retraction if there exists a morphism $v$ from $O'$ to $O$, such that $O' \xrightarrow{v} O$ and
$u * v = i_{O'}$. A morphism $u$ from $O$ to $O'$ is called a coretraction if there exists a
morphism $v$ from $O'$ to $O$ such that $O' \xrightarrow{v} O$ and $v * u = i_O$. A morphism $u$ is an
isomorphism if it is both a retraction and a coretraction, and then $O$ is said to be
isomorphic to $O'$, denoted as $O \sim O'$. If $O \sim O'$ and $O' \sim O''$ then $O \sim O''$.

[Definition 3] A covariant functor F from category C1 to category C2 is defined
as a pair of maps $F = (F_{obj}, F_{mor})$:

• objects of C1 $\xrightarrow{F_{obj}}$ objects of C2;

• morphisms of C1 $\xrightarrow{F_{mor}}$ morphisms of C2;

or simply as $C_1 \xrightarrow{F} C_2$. Usually we refer to a covariant functor simply as a
functor.

Figure 3 gives an example of a functor. G1 and G2 are two objects in category
Gph, and a morphism $u$ is defined such that $G_1 \xrightarrow{u} G_2$. A covariant functor
$F = (F_{obj}, F_{mor})$ is defined as the mapping from graph (Gph) to set (Set). We have
$G_1 \xrightarrow{F_{obj}} S_1$ and $G_2 \xrightarrow{F_{obj}} S_2$. Corresponding to $u$, the morphism between S1
and S2 is $F_{mor}(u)$.
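
By way of a minimal sketch only, one concrete functor of this kind maps each graph to its vertex set and each graph morphism to the underlying function on vertices. The specification does not state which functor Figure 3 depicts, so this vertex-set choice, the tuple encoding of a graph as (vertices, directed edges), and all names below (f_obj, f_mor, compose, is_graph_morphism) are assumptions made for illustration.

```python
def is_graph_morphism(g1, g2, u):
    """Check that the vertex map u sends vertices to vertices and edges to edges."""
    v1, e1 = g1
    v2, e2 = g2
    return (all(u[x] in v2 for x in v1)
            and all((u[a], u[b]) in e2 for a, b in e1))

def f_obj(graph):
    """F_obj: send a graph to its vertex set (an object of Set)."""
    vertices, _edges = graph
    return frozenset(vertices)

def f_mor(u):
    """F_mor: send a graph morphism to the same map viewed as a function on sets."""
    return dict(u)

def compose(v, u):
    """Composition v * u: apply u first, then v."""
    return {x: v[u[x]] for x in u}

# Three small directed graphs and two morphisms G1 -u-> G2 -v-> G3.
G1 = ({"a", "b", "c"}, {("a", "b"), ("b", "c")})
G2 = ({1, 2}, {(1, 2), (2, 2)})
G3 = ({"x"}, {("x", "x")})
u = {"a": 1, "b": 2, "c": 2}
v = {1: "x", 2: "x"}

S1, S2 = f_obj(G1), f_obj(G2)
# Functoriality: the image of a composite equals the composite of the images.
preserves_composition = f_mor(compose(v, u)) == compose(f_mor(v), f_mor(u))
```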

[Definition 4] If a functor F defined on categories C1 and C2 preserves all
retractions (coretractions, isomorphisms) from C1 to C2, it is called a retraction
preserving (coretraction preserving, isomorphism preserving) functor.

[Definition 5] The composition of two functors F1 on categories C1, C2
($C_1 \xrightarrow{F_1} C_2$) and F2 on categories C2, C3 ($C_2 \xrightarrow{F_2} C_3$) is defined as the functor
$F_2 * F_1$ on C1 and C3, such that $C_1 \xrightarrow{F_2 * F_1} C_3$.

[Definition 6] Two categories C1 and C2 are called equivalent if there exist two
functors F1 and F2, such that $C_1 \xrightarrow{F_1} C_2$ and $C_2 \xrightarrow{F_2} C_1$, and the
compositions of the two functors have the properties

1. $F_2 * F_1 = I_{C_1}$;

2. $F_1 * F_2 = I_{C_2}$.


With the concept of functor defined within the scope of computer vision and
graphics applications, the geometrical, physical and graph representations can
be viewed as different categories. In the 3D case, their relations are shown in
Figure 2. The graph-based representation can be viewed as the result of structural
abstractions from geometrical and physical representations, by applying the
functors that map Geo and Phy to Gph respectively.

[Definition 7] A hypergraph is an ordered pair $G = (X, Y)$, where
$X = \{v_i \mid 1 \le i \le n\}$ is a finite set of vertices and $Y = \{H_j \mid 1 \le j \le m\}$ is a finite
set of hyperedges; each $H_j$ is a subset of X such that
$H_1 \cup H_2 \cup \ldots \cup H_m = X$.

[Definition 8] An attributed hypergraph (AH) G is a hypergraph defined as an
ordered pair $G = (X, Y)$ associated with $A_X$ and $A_Y$, where $G = (X, Y)$ is a
hypergraph, $A_X$ is a finite set of vertex attribute values and $A_Y$ is a finite set of
hyperedge attribute values. $v_i$ in X may be assigned values of attribute from $A_X$,
and $H_j$ in Y may be assigned values of attribute from $A_Y$.
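
Definitions 7 and 8 can be sketched as a small container type. This is an illustrative reading only, not the data structure of the invention (which is the net-like structure described next); the class name AttributedHypergraph, its field names and the is_valid check are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class AttributedHypergraph:
    vertices: set = field(default_factory=set)            # X
    hyperedges: dict = field(default_factory=dict)        # Y: name -> frozenset of vertices
    vertex_attrs: dict = field(default_factory=dict)      # vertex -> values drawn from A_X
    hyperedge_attrs: dict = field(default_factory=dict)   # name -> values drawn from A_Y

    def add_vertex(self, v, **attrs):
        self.vertices.add(v)
        self.vertex_attrs[v] = dict(attrs)

    def add_hyperedge(self, name, members, **attrs):
        members = frozenset(members)
        if not members <= self.vertices:
            raise ValueError("a hyperedge must be a subset of X")
        self.hyperedges[name] = members
        self.hyperedge_attrs[name] = dict(attrs)

    def is_valid(self):
        """Definition 7: the union of all hyperedges must equal X."""
        covered = set().union(*self.hyperedges.values())
        return covered == self.vertices

g = AttributedHypergraph()
for name, color in (("v1", "red"), ("v2", "red"), ("v3", "blue")):
    g.add_vertex(name, color=color)
g.add_hyperedge("H1", {"v1", "v2"}, label="red patch")
g.add_hyperedge("H2", {"v2", "v3"}, label="boundary")
covered_before = g.is_valid()
g.add_vertex("v4", color="green")   # v4 is not yet in any hyperedge
covered_after = g.is_valid()
```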

In the present invention, the hypergraph is implemented with a net-like 
dynamic data structure (an example in Figure 4). The term "dynamic" signifies 
that the data structure supports structural operations such as join and 
subdivision on hypergraphs on the fly, which, in the view of category theory, are 
related to certain changes of the object via functors. 

In a dynamic structured hypergraph, the basic element is called a node,
shown as a rectangle. The nodes at the bottom level represent the vertices; the
nodes at the intermediate layers are the hyperedges, which are supersets of the
nodes at their lower levels; and the node at the top layer (called the root)
represents the entire hypergraph.

With such a data structure, the structural changes on hypergraphs can be 
handled efficiently. For example, the join of two hyperedges is simply the 
combination of the two corresponding nodes and the re-organization of the 
associated links. Graph-based computations such as the re-organization of a 
hyperedge based on certain criteria can be performed by reshuffling the links 
between the nodes. However, in traditional hypergraph data representations 
such as the incidence matrix or adjacency matrix, any topological change could 
initiate a re-organization over many entities. Data elements with or without 
relations to the nodes to be operated on may have to be accessed. 
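
The locality of such structural changes can be sketched as follows. The Node class, the link and unlink helpers and the join function are hypothetical names introduced here for illustration; the sketch assumes each node keeps explicit links to the layer above and the layer below, so that a join touches only the nodes linked to the two hyperedges being combined.

```python
class Node:
    """One node of the net-like structure: a vertex, a hyperedge or the root."""
    def __init__(self, label, attrs=None):
        self.label = label
        self.attrs = dict(attrs or {})
        self.children = []   # links to nodes in the layer below
        self.parents = []    # links to nodes in the layer above

def link(parent, child):
    parent.children.append(child)
    child.parents.append(parent)

def unlink(parent, child):
    parent.children.remove(child)
    child.parents.remove(parent)

def join(h1, h2, label):
    """Combine two hyperedge nodes into one and re-organize only their links."""
    joined = Node(label, {**h1.attrs, **h2.attrs})
    for old in (h1, h2):
        for child in list(old.children):
            unlink(old, child)
            if child not in joined.children:
                link(joined, child)
        for parent in list(old.parents):
            unlink(parent, old)
            if joined not in parent.children:
                link(parent, joined)
    return joined

# Three vertices, two overlapping hyperedges and a root.
v1, v2, v3 = Node("v1"), Node("v2"), Node("v3")
h1, h2, root = Node("H1"), Node("H2"), Node("root")
for h, members in ((h1, (v1, v2)), (h2, (v2, v3))):
    link(root, h)
    for m in members:
        link(h, m)

joined = join(h1, h2, "H12")
```

In an incidence matrix the same join would require rewriting two columns across every vertex row, whether or not the vertex belongs to either hyperedge.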

In a hypergraph, information represented by nodes at higher layers is
abstracted from those at lower layers. The nodes at higher layers are important
for intelligent manipulation and knowledge processing, while those at lower
layers are crucial for complete and accurate modeling. The nodes at the bottom
layer construct an elementary graph.

[Definition 9] In a dynamic structured attributed hypergraph, the nodes at the
bottom layer are called the elementary nodes, and the links among the elementary
nodes are called the elementary edges. The attributed graph, which consists of
the elementary nodes and the elementary edges, is called the elementary graph.

If a hypergraph has n layers, and the node set at layer i is $X_i$, then
hypergraph $G = (X, Y)$ can be written in the form of $G = (X_1, X_2, \ldots, X_n)$, where
$X = X_1$ and $Y = \{X_2, \ldots, X_n\}$. The meanings of the nodes at the hyperedge
layers normally depend on the applications. In most applications, we prefer that
there is a layer of hyperedges that characterizes the object's higher level
properties. For example, in the task of environment modeling for indoor
navigation, we can have the hyperedge layers on top of the elementary graph:

• The elementary graph characterizes the basic features captured from the
sensors, such as corners and lines;

• The hyperedges at the hyperedge layer extract the organization of the bottom
nodes, representing higher level knowledge such as the layout of the furniture
in a room or the organization of the rooms in a building;

• The root node represents the entire scene.

The bottom layer of an attributed hypergraph should simultaneously satisfy
the following two requirements:

• The ability to represent subtle features for general patterns, from which one
can reconstruct a complete object;

• The suitability to integrate the representation into a structural framework.

The use of the mesh-based representation is ideal due to the unstructured
nature of general 3D shapes. Compared with other mesh-based methodologies
such as NURBS, the triangular mesh has the advantages that it is simple and
flexible. In the vertex-edge structure, a triangular mesh can be written in the form
of $T = (V_t, E_t)$, where $V_t$ is the vertex set and $E_t$ is the edge set of the mesh.

[Definition 10] A representative attributed graph (RAG) $G = (V, E)$ of a triangular
mesh T is an attributed graph constructed in the following manner:

• for each vertex $v_t \in V_t$, there is a vertex $v_a \in V$ corresponding to it, denoted
as $v_t \to v_a$;

• for each edge $e_t \in E_t$, there is an edge $e_a \in E$ corresponding to it, denoted
as $e_t \to e_a$;

• $v_t$'s features are mapped to $v_a$'s features;

• $e_t$'s features are mapped to $e_a$'s features.

The AHR of a triangular mesh, which represents a 3D shape, is based on
such a RAG. The elementary nodes of an attributed hypergraph (i.e., the nodes
at the bottom level) consist of the vertices in the RAG, and the elementary edges
are copied from the edges in the RAG. The properties of the surface attached to
the triangular mesh, such as location, orientation, area, color, texture, mass
density and elasticity, are mapped to the attributes associated with the vertices
or edges. Thus, the RAG of the mesh directly constructs the elementary graph of
the AHR.
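
A minimal sketch of the mapping in Definition 10 follows, assuming the mesh is supplied as a vertex table plus a list of index triples. The function name build_rag, the input layout and the choice of "position" and "length" as the mapped features are assumptions for illustration; the specification lists many more attributes (orientation, area, color, texture, mass density, elasticity) that would be attached in the same way.

```python
import math

def build_rag(vertices, triangles, vertex_features=None):
    """Build a representative attributed graph from a triangular mesh.

    vertices: dict mapping vertex id -> (x, y, z)
    triangles: iterable of (i, j, k) vertex-id triples
    vertex_features: optional dict mapping vertex id -> extra attribute dict
    """
    extra = vertex_features or {}
    # Each mesh vertex v_t maps to one RAG vertex v_a carrying its features.
    rag_vertices = {vid: {"position": pos, **extra.get(vid, {})}
                    for vid, pos in vertices.items()}
    # Each mesh edge e_t maps to one RAG edge e_a; shared edges appear once.
    rag_edges = {}
    for i, j, k in triangles:
        for a, b in ((i, j), (j, k), (k, i)):
            key = frozenset((a, b))
            if key not in rag_edges:
                rag_edges[key] = {"length": math.dist(vertices[a], vertices[b])}
    return rag_vertices, rag_edges

# Two triangles sharing the edge between vertices 1 and 2.
mesh_vertices = {0: (0.0, 0.0, 0.0), 1: (1.0, 0.0, 0.0),
                 2: (0.0, 1.0, 0.0), 3: (1.0, 1.0, 0.0)}
mesh_triangles = [(0, 1, 2), (1, 3, 2)]
rag_v, rag_e = build_rag(mesh_vertices, mesh_triangles,
                         vertex_features={0: {"color": "red"}})
```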

[Definition 11] Suppose that $X = \{x_1, x_2, \ldots, x_n\}$ is the vertex set of attributed
hypergraph $H = (X, Y)$, $E = \{x_{e_1}, x_{e_2}, \ldots, x_{e_m}\}$ ($E \subset X$, $e_k \le n$; $k = 1, 2, \ldots, m$),
and $a_{e_1}, a_{e_2}, \ldots, a_{e_m}$ are the vertex attributes associated with vertices
$x_{e_1}, x_{e_2}, \ldots, x_{e_m}$ respectively. We say that E is a hyperedge induced by attribute
value a, if for a selected attribute value a, we have:

• $a \in \{a_k \mid 1 \le k \le n\}$ if the corresponding vertex $x_k \in E$;

• $a \notin \{a_k \mid 1 \le k \le n\}$ if the corresponding vertex $x_k \notin E$;

• E, as a subgraph of H, is connected.

[Definition 12] The union of two hypergraphs G1 = (X1, Y1) and G2 = (X2, Y2) is
a new hypergraph G = (X, Y), denoted as G = G1 ∪ G2, such that X = X1 ∪ X2
and Y = Y1 ∪ Y2.

[Definition 13] The intersection of two hypergraphs G1 = (X1, Y1) and G2 = (X2,
Y2) is a new hypergraph G = (X, Y), denoted as G = G1 ∩ G2, such that X = X1 ∩
X2 and Y = Y1 ∩ Y2.

Figure 5(a) illustrates the operation of union, and Figure 5(b) illustrates
intersection on two simple hypergraphs.

[Definition 14] The dichotomy of a vertex vd ∈ X in hypergraph G = (X, Y),
denoted as (vd1 0 vd2) = vd, can be obtained by the following steps:
1. in X, replace vd by two vertices vd1, vd2;
2. in Y, replace vd in all hyperedges by two vertices vd1, vd2; and
3. add a new hyperedge Hd to Y where Hd = {vd1, vd2}.

An example of dichotomy is shown in Figure 6(a).

[Definition 15] If in hypergraph G = (X, Y), two vertices vi ∈ X and vj ∈ X are
adjacent in n hyperedges Hi1, Hi2, ..., Hin in Y (n ≥ 1), then the merge of vi and vj,
denoted as vm = vi © vj, can be obtained by the following steps:

1. in X, replace vi and vj by a single vertex vm;

2. in Hi1, Hi2, ..., Hin, replace vi and vj by a single vertex vm;

3. replace vi or vj by vm in any other hyperedges in Y;

4. if a hyperedge H contains only vi or vj, it is nullified.

An example of merge is shown in Figure 6(b).

[Definition 16] The subdivision of hyperedge Hs = {vi1, vi2, ..., vin} in
hypergraph G = (X, Y) (where vik ∈ X, k = 1, 2, ..., n and Hs ∈ Y), denoted as
(Hs1, Hs2, ..., Hsn) = a Hs, can be obtained by the following steps:
1. add a new vertex ve to X;
2. add n hyperedges Hs1, Hs2, ..., Hsn to Y, such that Hsk = {vik, ve} (1 ≤ k ≤ n);
3. nullify Hs.

Figure 6(c) illustrates an example of subdivision on a simple hypergraph.

[Definition 17] If in hypergraph G = (X, Y), Hj1 ∈ Y and Hj2 ∈ Y are two
hyperedges with at least one common vertex, then the join of Hj1 and Hj2 in
hypergraph G = (X, Y), denoted as Hj = Hj1 ∨ Hj2, is to:

1. add a new hyperedge Hj ∈ Y such that Hj = Hj1 ∪ Hj2;
2. nullify the common vertices of Hj1 and Hj2 in Hj;
3. nullify Hj1 and Hj2.

An example of join is shown in Figure 6(d).

[Definition 18] The attribute transition of a hyperedge H ∈ Y (or vertex v ∈ X) in
an attributed hypergraph G = (X, Y), denoted as A'H = fa(AH) (or A'v = fa(Av)), is a
mapping AH → A'H (or Av → A'v) which transforms the attribute value AH
(or attribute value Av) without any change to H's connectivities (refer to Figure
6(e)).

If op1 and op2 are two primary operators given in Definitions 12 to 18, the
compound operator © of op1 and op2 is defined as applying op2 on the result of
applying op1, denoted as © = op2 * op1.

Figure 6 shows an example. The compound operator © given by the figure can
be written in the form of: (H3, H7, He) = H1 © H2. In most transformations on an
AH, a qualitative change operator (union, intersection, dichotomy, merge,
subdivision or join) is usually associated with one or more quantitative change
operators (attribute transition).
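
The qualitative operators of Definitions 12 to 17 can be sketched on a plain set-based hypergraph as follows. This is a minimal illustration under stated assumptions: the class name Hypergraph, the representation of each hyperedge as a frozenset of vertex names, and the reading of "nullified" as "removed from Y" (or, in join, "removed from Hj") are choices made for this sketch, and attributes are omitted.

```python
from dataclasses import dataclass, field

@dataclass
class Hypergraph:
    X: set = field(default_factory=set)   # vertices
    Y: set = field(default_factory=set)   # hyperedges, each a frozenset of vertices

def union(g1, g2):                         # Definition 12
    return Hypergraph(g1.X | g2.X, g1.Y | g2.Y)

def intersection(g1, g2):                  # Definition 13
    return Hypergraph(g1.X & g2.X, g1.Y & g2.Y)

def dichotomy(g, vd, vd1, vd2):            # Definition 14
    X = (g.X - {vd}) | {vd1, vd2}
    Y = {frozenset((h - {vd}) | {vd1, vd2}) if vd in h else h for h in g.Y}
    Y.add(frozenset({vd1, vd2}))           # new hyperedge Hd = {vd1, vd2}
    return Hypergraph(X, Y)

def merge(g, vi, vj, vm):                  # Definition 15
    X = (g.X - {vi, vj}) | {vm}
    Y = set()
    for h in g.Y:
        if vi in h or vj in h:
            h = frozenset((h - {vi, vj}) | {vm})
            if h == {vm}:                  # collapsed to the merged vertex: nullified
                continue
        Y.add(h)
    return Hypergraph(X, Y)

def subdivision(g, hs, ve):                # Definition 16
    X = g.X | {ve}
    Y = (g.Y - {hs}) | {frozenset({v, ve}) for v in hs}
    return Hypergraph(X, Y)

def join(g, h1, h2):                       # Definition 17
    common = h1 & h2
    if not common:
        raise ValueError("join requires at least one common vertex")
    hj = frozenset((h1 | h2) - common)     # common vertices nullified in Hj
    return Hypergraph(set(g.X), (g.Y - {h1, h2}) | {hj})
```

Under these assumptions, applying dichotomy to a vertex and then merging the two resulting vertices restores the original hypergraph, which is one way to see that the two operators are inverse qualitative changes.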

In the view of category theory, the primary operators and their
compositions constitute the morphisms between different AH's in the AH
category (Ahp). Through the mapping of functors, they correspond to the
motions of the 3D patterns.

II. Detailed Description of the Methods of Model Construction,
Transformation and Manipulation

A. Intelligent 3D Model Construction

Figure 1 is a block diagram of an integrated computer vision and graphics
system using the intelligent modeling, transformation and manipulation method
according to the present invention. The steps in broken boxes are optional and
are not directly essential for the method of modeling, transformation and
manipulation. As for the optional steps, the entire process may be carried out on
one computer or on one or more local computers connected together, or the
internet may be used as a transmission medium for working the invention
between server computers and client computers. This will be more fully
discussed later.

The input data of this model construction algorithm can be from CCD
cameras, laser range finders, radar range finders, sonar, or any other devices
that generate images. All mentioned imaging devices produce images that
typically contain thousands, if not hundreds of thousands, of image pixels. It is
neither practical nor productive to carry out object modeling directly at the pixel
level. Thus, the first step of the model construction is to extract the topological
features from captured imaging data of the three dimensional object shown in
the image(s).

The feature extraction process applied in this patent is from the research
result of Q. Gao and A. K. C. Wong ("A Curve Detection Approach Based on
Perceptual Organization", Pattern Recognition, Vol. 26, No. 7, pp. 1039-1046,
August 1993), which is based on the relaxation edge-tracing method ("An
Application of Relaxation Labeling to Line and Curve Enhancement", S. W.
Zucker, R. A. Hummel and A. Rosenfeld, IEEE Transactions on Computers, Vol.
26, pp. 394-403, 1977). It has four steps: 1) feature detection; 2) feature
segmentation; 3) feature classification; and 4) feature grouping. The first step of
the method detects the salient features such as, for example, points, lines and
contours from the pixel image, and the second step finds the topological
relations among them. In the case of using radar or laser range finders, these
topological features already have the range information (the distance of the
feature to the image projection plane). In the case of using conventional cameras,
at least two cameras are required for each view to recover the range information.
The range information can then be calculated using the method given by T.
S. Huang ("Determining Three-Dimensional Motion and Structure from Two
Perspective Views", in Handbook of Pattern Recognition and Image Processing,
Chapter 14, Academic Press, 1986) or by H. C. Longuet-Higgins ("A Computer
Algorithm for Reconstructing a Scene from Two Projections", Nature, Vol. 293,
pp. 133-135, 1981).

The process to create the triangular meshes can be run directly on the
extracted topological features. However, it is desirable to first organize them into
a geometrical model that represents the shape of the object, which can simplify
the process of triangular mesh generation, especially when the features
extracted are sparse.

The most straightforward method to represent a shape is the use of
wire-frames consisting of the surface discontinuities of the object. The two types
of elements in a wire-frame model are lines and nodes. The lines correspond to
straight lines or curves, and the nodes are point features such as corners or
points of abrupt curvature change. We consider the face modeling as a more
generic dual of the wire-frame model. In a face model, the elements are the faces
(surfaces) bounded by closed circuits of wire-frame edges. Using faces to model
an object has several advantages over modeling with wire-frames:

• fewer faces than points/lines are required to describe the same object;

• a face description enables information on the object, such as geometry,
surface colors and textures, to be attached to the model more easily;

• a face can be bounded by any type of closed curve in addition to the
wire-frame edges, making it straightforward to enclose any type of contour in
the object.

The procedure to extract faces is illustrated in Figure 11. First, a procedure
called Tablet_Point_Line() handles the points, lines and contours. An incidence
table is generated as shown in Table 1, where Pi (i = 1, 2, ..., k) are the point
features, and Lj (j = 1, 2, ..., m) are the line or contour features. Empty columns,
i.e., isolated points, are considered as noise and are removed from the table.

Table 1: The incidence table of lines/contours and points.

        P1    P2    ...   Pk
L1      1     0     ...   0
...
Lm      0     0     ...   1


Merge_Points() and Merge_Lines() are the procedures to eliminate duplicated
features resulting from potentially noisy input data (e.g., the reflections around a
sharp edge). Merge_Points() scans through the set of corners. The Euclidean
distance between pairs of neighboring corners is evaluated. If its value is less
than a preset threshold (normally the distance threshold is about 2-3 pixels in
length), the two points are merged into one. Merge_Lines() scans through the set
of lines and contours. If two lines or contours are close to each other and the
angle between them is very close to 0 or π, they are merged into one. According
to the merging result, the incidence table is adjusted. Then the faces are
extracted by procedure Extract_Faces(). Each face is represented by:

1. a set of lines or contours to form the face boundary;

2. a set of points;

3. a vector signifying the normal direction of the face.
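
A possible reading of the point-merging step is sketched below in Python. The function name merge_points, the greedy clustering order and the use of the cluster centroid as the merged position are assumptions of this sketch; the text above only states that two corners closer than the threshold are merged into one.

```python
import math

def merge_points(points, threshold=2.5):
    """Greedily merge corners that lie closer than `threshold` (in pixels).

    Returns (merged_points, mapping), where mapping[i] is the index of the
    merged point that original point i was absorbed into. The mapping can be
    used to adjust the incidence table afterwards.
    """
    clusters = []   # each cluster is the list of original points it absorbed
    mapping = []
    for p in points:
        for idx, members in enumerate(clusters):
            centroid = tuple(sum(c) / len(members) for c in zip(*members))
            if math.dist(p, centroid) < threshold:
                members.append(p)
                mapping.append(idx)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([p])
            mapping.append(len(clusters) - 1)
    merged = [tuple(sum(c) / len(m) for c in zip(*m)) for m in clusters]
    return merged, mapping
```

With the default threshold of 2.5 pixels, two corners one pixel apart collapse into their midpoint, while a corner far away is left untouched.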

The incidence matrix shown in Table 1 is actually a representation of a
connected graph. Each face of the object corresponds to a closed circuit.
Searching for a closed cycle in a graph is a common task in linear graph theory.
Here a depth-first search algorithm is applied. Different from the conventional
cycle searching algorithm, in the face searching there is a constraint to judge
whether a cycle constitutes a face or not: in the cycle, all features have to be
approximately on the same plane. The search algorithm begins with a set of
three connected corners, from which an approximated normal vector of the
possible face can be calculated. During the expansion of the cycle, this normal
vector is consistently updated according to the new topological features added.
The face grouping procedure includes the following steps:
1. set the face set Sf empty;
2. find all ordered lists of 3 corners in which the 3 corners are connected
consecutively by 2 edges, denoted as the list Li = {ci1, ci2, ci3} with corners
ci1, ci2 and ci3, where 1 ≤ i1, i2, i3 ≤ n, n is the total number of corners, 1 ≤ i ≤
k is the index of the list, and k is the total number of 3-corner lists;
3. let i = 1;

4. for each list Li, calculate the normal vector of the plane defined by all the
corner points in the list, denoted as ni;

5. if the tail of Li is connected to its head, then list Li represents a face; put Li in
Sf, and go to step 8;

6. if there exists a corner cj in the corner set that satisfies:
• cj is not in Li yet;

• cj connects the current tail of Li;

• cj is approximately in the plane defined by all the corners in Li;

then put cj in Li as the new tail; if no corner satisfies the above conditions, go
to step 8;
7. go to step 4;
8. let i = i+1, then if i ≤ k then go to step 4;

9. output Sf and exit.
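
The face search above can be sketched as follows. This is a simplified illustration, not the procedure Extract_Faces() itself: the helper names are hypothetical, the normal vector is computed once from the first three corners instead of being updated as the cycle grows, the first admissible corner is always taken (no backtracking), and corner identifiers are assumed to be sortable.

```python
import math

def _sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def extract_faces(corners, edges, tol=1e-6):
    """corners: {id: (x, y, z)}; edges: iterable of (id, id) pairs.
    Returns a list of faces, each an ordered tuple of corner ids."""
    adj = {c: set() for c in corners}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    faces = {}
    # Step 2: all ordered lists of 3 corners joined consecutively by 2 edges.
    seeds = [(a, b, c) for b in sorted(corners)
             for a in sorted(adj[b]) for c in sorted(adj[b]) if a != c]
    for seed in seeds:
        path = list(seed)
        p0 = corners[path[0]]
        normal = _cross(_sub(corners[path[1]], p0), _sub(corners[path[2]], p0))
        length = math.sqrt(_dot(normal, normal))
        if length < tol:
            continue                      # collinear seed: no plane defined
        normal = tuple(x / length for x in normal)
        while True:
            if path[0] in adj[path[-1]]:  # step 5: tail connected to head
                faces.setdefault(frozenset(path), tuple(path))
                break
            # Step 6: unused corner, adjacent to the tail, approximately in plane.
            candidates = [c for c in sorted(adj[path[-1]])
                          if c not in path
                          and abs(_dot(normal, _sub(corners[c], p0))) < tol]
            if not candidates:
                break
            path.append(candidates[0])
    return list(faces.values())
```

On the wire-frame of a unit cube (8 corners, 12 edges) the sketch returns the six square faces, each found from several seed lists and stored once.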



Finally, procedure Group_Faces() groups the faces together if they have one or
more common edges and their normal vectors are collinear/parallel.

Building a triangular mesh is straightforward from the face model. The faces
are first converted into the forms of polygons or polygonalized circles. The
following recursive procedure decomposes a polygon P0 with n edges and n
corners into n-2 triangles:

1. set the triangle list T empty, let m = n;

2. let j = 0;

3. in polygon Pj, start from any point, label the corners from 1 to m
consecutively, denoted as c1, c2, ..., cm, where m is the number of corners in
polygon Pj;

4. let i = 1;

5. add an edge from corner i to corner i+2 to make up a new triangle t = {ci,
ci+1, ci+2}, and add t to T (if i+2 = m+1, then use c1 as ci+2);
6. let i = i+2;

7. if i < m, go to step 5;

8. use the nodes c1, c3, ..., c2k+1 and the newly added edges to construct
a new polygon Pj+1 with reduced numbers of corners and edges;
9. if Pj+1 is a triangle, then add Pj+1 to T, output T and exit;
10. let j = j+1, go to step 3.
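
The decomposition can be sketched in Python as below. The function name triangulate_polygon is hypothetical, and two details are assumptions of this sketch: the polygon is taken to be convex so that every cut-off triangle lies inside it, and the loop also stops when fewer than three corners remain, a case that arises for polygons such as quadrilaterals.

```python
def triangulate_polygon(corners):
    """Decompose a polygon with n corners into n-2 triangles.

    corners: list of corner labels (or coordinates) in boundary order.
    Returns a list of (ci, ci+1, ci+2) triples.
    """
    triangles = []
    polygon = list(corners)
    while len(polygon) >= 3:
        m = len(polygon)
        if m == 3:
            triangles.append(tuple(polygon))   # step 9: remaining triangle
            break
        i = 0
        while i < m - 1:                       # steps 5 to 7, zero-based index
            # Cut off corner i+1 by joining corner i to corner i+2;
            # the index wraps around so that c1 serves as c(m+1).
            triangles.append((polygon[i], polygon[i + 1], polygon[(i + 2) % m]))
            i += 2
        polygon = polygon[0::2]                # step 8: keep c1, c3, c5, ...
    return triangles
```

For a square with corners 0, 1, 2, 3 the sketch produces the triangles (0, 1, 2) and (2, 3, 0), and for any polygon with n corners it produces n-2 triangles.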

25 

Since the triangles generated by the above processes are directly from the
face models, they automatically satisfy the conditions required for a triangular
mesh. If the topological features extracted from the imaging data are dense, then
triangular meshes can be generated using an improved version of the
well-known Delaunay triangulation method given by R. E. Fayek and A. K. Wong
("Triangular Mesh Model for Natural Terrain", Proceedings of SPIE: Intelligent
Robots and Computer Vision, Oct 1994). However, to apply such a method, a
transformation of the features is required since Delaunay triangulation only works
on data that lie approximately on the same plane. The topological features of an
object are transformed from the conventional Cartesian coordinate system,
where location is defined by (x, y, z), into a spherical coordinate system centered
within the object, where location is defined by (α, β, r) (refer to Figure 10). The
entire process of applying Delaunay triangulation is:

1. compute a spherical coordinate system for the three dimensional object to be
modeled;

1) compute the geometrical center Pc = (Xc, Yc, Zc) from the coordinates of
the points extracted from the image:

Xc = (Xmax+Xmin)/2.0, Yc = (Ymax+Ymin)/2.0, Zc = (Zmax+Zmin)/2.0
where Xmax, Ymax, Zmax, Xmin, Ymin and Zmin give the range that the
object occupies in space, expressed in the Cartesian coordinate system.

2) compute the center of gravity Pg = (Xg, Yg, Zg) of all topological features:

Xg = (ΣXi)/n, Yg = (ΣYi)/n, Zg = (ΣZi)/n

where n is the total number of feature points and (Xi, Yi, Zi) is the
coordinate of the i-th point expressed in the Cartesian coordinate system.

3) compute the major axis, which is given by vector Va = (Pc, Pg);

4) compute a minor axis, which is given by vector Vm, where Vm is:

Vm = Va × (Pc, Pk),
where × signifies the cross product, and Pk is the feature point that has
the largest distance to vector Va.

5) construct the spherical coordinate system, such that Pg is the center, Va
is the direction of α=0, Vm is the direction of β=0, and r is the distance to
the center.

2. transform the coordinates of said face model from the conventional three
dimensional Cartesian coordinate system to said spherical coordinate
system; for a transformed point (α, β, r), its relation to (x, y, z) satisfies:
x = r sinα cosβ, y = r sinα sinβ, z = r cosα

3. apply Delaunay triangulation using r=0 as its reference plane, that is,
perform the triangulation on (α, β), while using r as the reference feature.

4. transform the resulting triangular meshes from the spherical coordinate
system to the conventional Cartesian coordinate system.
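
The coordinate change in steps 2 and 4 can be sketched as below. The function names are hypothetical, and the sketch makes two simplifying assumptions: the axes of the spherical system are taken to coincide with the Cartesian axes rather than with the computed major and minor axes, and the polar-angle convention z = r cos α is used so that x² + y² + z² = r² holds. The planar triangulation of step 3 on the (α, β) pairs is not shown.

```python
import math

def center_of_gravity(points):
    """Pg: the mean of all feature points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def to_spherical(p, center):
    """(x, y, z) -> (alpha, beta, r) about `center`."""
    x, y, z = (a - b for a, b in zip(p, center))
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return (0.0, 0.0, 0.0)             # the center itself has no direction
    alpha = math.acos(max(-1.0, min(1.0, z / r)))
    beta = math.atan2(y, x)
    return (alpha, beta, r)

def to_cartesian(s, center):
    """(alpha, beta, r) -> (x, y, z): the inverse of to_spherical."""
    alpha, beta, r = s
    return (center[0] + r * math.sin(alpha) * math.cos(beta),
            center[1] + r * math.sin(alpha) * math.sin(beta),
            center[2] + r * math.cos(alpha))
```

After the triangulation has been computed on the angular coordinates, each vertex can be carried back with to_cartesian, so the mesh connectivity found in the (α, β) plane is reused unchanged in 3D.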

After a set of triangular meshes is obtained, they can be directly mapped
to the representative attributed graph (RAG) according to the procedures given
in Definition 10. Hyperedges are then induced from the elementary graph in the
hypergraph, which is a dual of the RAG. Different from an edge in an attributed
graph, a hyperedge in an attributed hypergraph represents the n-ary relation
among a number of (at least two) vertices, groups of vertices or other
hyperedges in the attributed hypergraph. In this invention, the hyperedges are
induced from the triangles in the RAG. The steps are:

1. select at least one type of attribute used for hyperedge inducing (for object
modeling, typical feature types are surface adjacency, orientation, color and
texture);

2. select the value bins Bi = [bi1, bi2), i = 1, 2, ..., k, of the attribute values for the
selected attribute type(s), where k is the total number of bins;

3. group all attributed vertices into groups, such that all attributed vertices in the
same group are connected and have attribute values that fall into the same bin:

1) set the group list G empty, let n = 0;

2) let n = n+1, j = 0;

3) find a triangle T in the RAG that is not in any group in G = {G1, G2, ...,
Gn-1}; if all triangles have been put into G, then exit;

4) compute attribute value b of T, find the bin Bi such that b ∈ Bi, and put T
in Gn, let j = 1;
5) if a triangle T' in the RAG is adjacent to any triangles in Gn, compute its
attribute value b'; if b' ∈ Bi, then add T' into Gn, and let

6) repeat step 5) until there is no T' found that satisfies the condition to be
put into Gn;

7) go to step 2).

4. compute an attribute value by computing the average attribute for all
triangles in the group and assign it as the attribute of the group;

5. assign an attributed hyperedge for each group.
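
The grouping in steps 3 to 5 amounts to region growing over adjacent triangles, which can be sketched as follows. The names find_bin and induce_hyperedges, the use of a single scalar attribute per triangle, and the definition of adjacency as sharing an edge are assumptions made for this illustration.

```python
from collections import deque

def find_bin(value, bins):
    """Return the index i of the bin [lo, hi) that contains value."""
    for idx, (lo, hi) in enumerate(bins):
        if lo <= value < hi:
            return idx
    raise ValueError(f"value {value} falls outside all bins")

def induce_hyperedges(triangles, attributes, bins):
    """triangles: list of vertex-index triples; attributes: one scalar per
    triangle; bins: list of (lo, hi) intervals. Returns one dict per group."""
    # Two triangles are treated as adjacent when they share an edge.
    edge_to_tris = {}
    for t_idx, (a, b, c) in enumerate(triangles):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_tris.setdefault(frozenset((u, v)), []).append(t_idx)
    neighbours = {i: set() for i in range(len(triangles))}
    for tris in edge_to_tris.values():
        for i in tris:
            neighbours[i].update(j for j in tris if j != i)

    assigned, hyperedges = set(), []
    for seed in range(len(triangles)):
        if seed in assigned:
            continue
        bin_idx = find_bin(attributes[seed], bins)
        group, queue = [seed], deque([seed])
        assigned.add(seed)
        while queue:                       # grow the group over adjacent triangles
            current = queue.popleft()
            for nb in sorted(neighbours[current]):
                if nb not in assigned and find_bin(attributes[nb], bins) == bin_idx:
                    assigned.add(nb)
                    group.append(nb)
                    queue.append(nb)
        hyperedges.append({
            "vertices": set().union(*(triangles[i] for i in group)),
            "triangles": group,
            "attribute": sum(attributes[i] for i in group) / len(group),
        })
    return hyperedges
```

Each returned group corresponds to one attributed hyperedge: its vertex set is the hyperedge, and the averaged value is the attribute assigned to it in step 4.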

The hyperedge grouping criteria depend on the types of attributes selected 
for the hyperedges. Such attributes can be kinematic attributes, geometrical 
attributes, physical attributes, and/or behavioral attributes. Figure 7 shows the 
levels of abstraction and their relations with triangular mesh and attributed 
hypergraph. 

In a typical application, there will be more than one image view. For example,
for a complete camera-based 3D object model, at least four views are preferably
taken: one for the front, one for the back, and two for the left and the right sides
of the object. For each view, following the procedure of attributed hypergraph
construction given above, we can obtain one attributed hypergraph. Therefore, a
method to synthesize all these attributed hypergraphs is required to compute a
complete attributed hypergraph representation (AHR) of the object. The
procedure comprises the following steps:

1. determine the overlapping region of all available attributed hypergraphs H1,
H2, ..., Hk, where k is the number of attributed hypergraphs:

1) transform the coordinates of points in the form of (x, y, z) that correspond
to the vertices in the hypergraphs to points in the spherical coordinate
system in the form of (α, β, r) as described before;

2) for all points (α, β, r), let x' = α, y' = β and z' = r;

3) project (x', y', z') to the plane z' = 0;
4) determine the overlapping regions O1, O2, ..., Om, and non-overlapping
regions Om+1, Om+2, ..., Om+n, by examining all projected points.

2. in each overlapped region Oi, 1 ≤ i ≤ m+n, using all the points, re-compute a
triangular mesh Ti by the Delaunay triangulation method as described before;

3. for each triangular mesh Ti, map the re-computed triangular mesh to a new
RAG Ri;

4. form a new attributed hypergraph H'i from each Ti;

5. integrate all the attributed hypergraphs into a new integrated attributed
hypergraph H by applying the union operator ∪ given by Definition 12:

H = H'1 ∪ H'2 ∪ ... ∪ H'm ∪ H'm+1 ∪ H'm+2 ∪ ... ∪ H'm+n

6. apply the hyperedge inducing method described above to compute attributed
hyperedges for the new integrated attributed hypergraph H;

7. associate the newly computed attributed hyperedges with H to form an
attributed hypergraph representation (AHR).

During the synthesis process described above, a more complete AHR is 
constructed. The possible redundant information in the input AHR's obtained 
from different views enables one to perform the following improvements on the 
recovered 3D data: 

• Compensate for system errors: 

System errors are brought in by the digitization error, feature detection error 
and computation error etc. The errors can be compensated by the averaging 
effect in the synthesis. 

• Recover missing features: 

Occlusions, unrealistic illuminations, or errors in feature detection can cause 
missing object parts in one view. However, the missing parts still have a good 
chance of being present in other views. 

• Eliminate erroneous features: 
Noise on the object's surfaces and in the background may form
erroneous features. In AHR synthesis, a score representing the number of
appearances for each feature can be added. Since noise has very little
chance of being repeated at the same place every time, it will have a very low
score. At the final stage, features with low scores can be eliminated.
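
The appearance score can be illustrated with a short Python sketch. It assumes that features from different views have already been matched and given a common hashable identifier, which is the difficult part in practice; the function name filter_features and the min_score parameter are hypothetical.

```python
from collections import Counter

def filter_features(views, min_score=2):
    """views: iterable of collections of feature identifiers, one per view.

    Returns (kept, scores): the features seen in at least `min_score` views
    and the appearance count of every feature.
    """
    # Count each feature at most once per view.
    scores = Counter(f for view in views for f in set(view))
    kept = {f for f, score in scores.items() if score >= min_score}
    return kept, scores
```

A feature such as a specular glare that shows up in a single view receives a score of 1 and is dropped, whereas a genuine corner seen in several views is retained.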

The same model construction concept can be easily applied to 2D 
objects. If the topological features extracted for AHR construction are in 2D 
domain instead of 3D domain, we can easily construct the AHR in the same way. 

B. Intelligent 3D Model Transformation 

One of the most important applications of attributed hypergraph 
representation (AHR) based object modeling is automatic transformation of 3D 
objects. A general transformation is basically the metamorphosis of an object 
from one state (the source) to another (the target) via continuous transitions 
(morphing). AHR provides an ideal tool for 3D transformation on object's shapes, 
colors, textures and many other properties. Given two AHR's representing the 
two states of an object, a set of operators defined for attributed hypergraph can 
be extracted. Moreover, a sequence of intermediate 3D objects under continuous 
deformation or transformation can be reconstructed via the construction of the 
intermediate AHR's. 

When two states of the same object are represented by AHR's, the 
transformation corresponds to a sequence of operations applied on the source 
AHR such that it finally changes itself into an AHR that is isomorphic to the target 
AHR. The AHR-based transformation disclosed herein is able to: 

1. Automatically determine a transformation path between two objects; and 

2. Justify the qualitative and the quantitative changes in the transformation.

To produce the metamorphosis of an object from one state to another,
there may exist many paths. There are usually certain constraints controlling the
transformation, thus making one path preferable to others. Volume preserving
and surface preserving constraints are popular in many applications. These
methods usually use Minkowski sums or a merged topology to control the
time-varying object. An AHR-based transformation is different in that it uses
feature preserving constraints.

In an AHR-based transformation, given the initial AHR G1 and the final AHR
G2, an optimal subgraph isomorphism algorithm first finds the optimal common
sub-hypergraph of the two AHR's. Since the distance measure applied in the
optimal subgraph isomorphism algorithm is confined by the type of selected
features, the transformation path generated follows the path with the minimal
cost subject to the feature type. In other words, the algorithm attempts to find the
path that minimizes the cost of subgraph isomorphism associated with the
selected feature type. Thus, it is called feature preserving transformation.

The operator sequence that performs the transformations between the two
AHR's can be similarly determined. The operator extraction begins with the
global motion parameters, rotation matrix R and translation vector T, from all the
matched elements of the two AHR's, which are vertices, edges or hyperedges.
The least squares method used in estimating R and T minimizes the total
displacement for the matched elements. Since there are only quantitative
changes for those matched ones, the qualitative information associated with
them is well preserved. Meanwhile, the optimal subgraph isomorphism algorithm
minimizes the number of unmatched elements since they will introduce penalties
that have the highest cost in matching. Hence, during the process of deforming
G1 into G2 by applying the operator sequence, the qualitative features in the
source AHR are maximally preserved.

If the two AHR's used for transformation are constructed through the
procedures described in Section A above, they should have four layers each,
one for root, one for hyperedges, one for edges and one for vertices. If one or
both have more than one layer of hyperedges, these layers have to be combined
into one using the hyperedge join operator given by Definition 17. The
procedures of AHR-based object transformation are:
1. let i = 1;

2. for the same i-th layer of the two AHR's G1 and G2, by the definition of
attributed hypergraph (Definition S and the description following that, as well
as Figure 4), the layers themselves are two attributed graphs respectively;
label them as G1i and G2i respectively;

3. for each pair G1i and G2i, compute the optimal subgraph isomorphism Si for
attributed graphs as given by A. K. C. Wong, M. L. You and S. C. Chan ("An
Algorithm for Graph Optimal Monomorphism", IEEE Transactions on S. M. C.,
Vol. 20(3), pp. 628-636, 1990); the distance associated with Si is denoted as
di;

4. if i < 4, let i = i+1, and go to step 2;

5. sum up di for i = 1, 2, 3, 4, and let the sequence S = (S1, S2, S3, S4) be the
optimal subgraph isomorphism for the attributed hypergraph;

6. compute the transformation operators corresponding to matched graph
elements (attributed vertices, attributed edges or attributed hyperedges) from
S:

1) from S, find all the matched graph elements in the form of a common
attributed subgraph g;

2) compute the estimated rotation matrix R and the translation T from the
displacements of all matched elements; theoretically, three matched
pairs of elements are enough to recover R and T since R and T only have
six degrees of freedom plus a scaling (zooming) factor; however, it is
desired to have more than four pairs and use a least squares method to
ensure the best estimation;
3) back project the calculated R and T to the elements in the initial graph to
obtain their computed displacement;
4) convert each element's displacement into an attribute transition operator
according to Definition 18;
7. compute the transformation operators corresponding to unmatched graph
elements from S:
1) traverse all elements in g computed in step 6-1) and search for
unmatched ones that are adjacent to at least one element in g;

2) if an unmatched node nu is adjacent to node ng in g while nu is in G2,
then

a) apply operator subdivision or dichotomy to ng to yield two new
elements n1 and n2;

b) apply two attribute transition operators to n1 and n2 respectively to
transform n1's attribute value to ng's, and n2's to nu's;

3) if an unmatched node nu is adjacent to node ng in g while nu is in G1,
then

a) apply operator join or merge to ng and nu to yield one new element np;

b) apply an attribute transition operator to np to transform np's attribute
value to ng's;

4) go to step 1) with updated (operators have been applied) G1, G2 and g
until there is no unmatched element;

8. list all said transformation operators in the sequence applied and
eliminate the ones that cancel each other.

With the sequence of operators extracted, the transformation path is well
defined by applying the operators one by one. The automatic transformation
frame generation is based on the definition of distance between attributed
hypergraphs given by A. K. C. Wong, M. L. You and S. C. Chan ("An Algorithm
for Graph Optimal Monomorphism", IEEE Transactions on S. M. C., Vol. 20(3), pp.
628-636, 1990). If d(G1, G2) is the distance between two AHR's G1 and G2, we
can use linear interpolation to generate continuous frames between G1 and G2.
Suppose that in the transformation, the total time from pattern G1 to G2 is T.
Given any time t such that 0 < t < T, the corresponding transformed AHR at t,
denoted as G(t), can be expressed as:

G(t) = G1 + [(t/T) * (G2 - G1)]    (1)

where G2 - G1 signifies the difference between G1 and G2 measured by the AH
operator set, and * means using its left-hand side as the arithmetic operator to
quantify its right-hand-side operator set.
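
For purely quantitative attributes, Equation (1) reduces to ordinary linear interpolation, which the following Python sketch illustrates. It covers only matched elements with numeric attribute vectors; the function name interpolate_attributes and the dictionary layout are assumptions of this sketch, and the qualitative operators are not modeled here.

```python
def interpolate_attributes(a1, a2, t, total_time):
    """Interpolate numeric attribute vectors between two matched AHR states.

    a1, a2: {element_id: tuple of numbers} for the source and target AHR.
    Returns the attributes at time t following a(t) = a1 + (t/T) * (a2 - a1).
    """
    if not 0 <= t <= total_time:
        raise ValueError("t must lie between 0 and the total time T")
    w = t / total_time
    return {
        key: tuple(x + w * (y - x) for x, y in zip(a1[key], a2[key]))
        for key in a1 if key in a2     # only elements present in both states
    }
```

Calling the function for a series of values of t between 0 and T gives the attribute values of the intermediate frames.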

The connotation of Equation (1) is that G(t) can be built by applying a proper
set of AH operators that varies with t. Suppose that the set of operators to
transform G1 to G2 is Op = {op1, op2, ..., opn}, where opi (1 ≤ i ≤ n) can be any
one of the primary attributed hypergraph operators defined previously
(Definitions 12 to 18), and the set of operators that transform G1 to G(t) is
expressed as Op(t) = {op1(t), op2(t), ..., opn(t)}. Referring to Equation (1), opi(t)
is derived from opi by:
• if opi is an attribute transition operator, opi(t) is also an attribute transition
operator:

opi(t) = (1 - t/T) opi    (2)

• if opi is intersection, dichotomy or subdivision, then:

opi(t) = opi if 0 < t < T
opi(t) = Iop otherwise    (3)

where Iop denotes the identity operator that does not change the input AHR;
if opi is accompanied by an attribute transition operator fa-opi, then:

fa-opi(t) = (1 - t/T) fa-opi    (4)

• if opi is union, merge or join, then:

opi(t) = Iop if 0 < t < T
opi(t) = opi otherwise    (5)

where Iop denotes the identity operator that does not change anything; if opi
is accompanied by an attribute transition operator fa-opi, we have:

fa-opi(t) = (1 - t/T) fa-opi    (6)

Figure 9 gives two examples of continuous transformations on simple graphs
with qualitative transition operators applied.
C. Augmented Reality by Intelligent 3D Model Manipulation

This invention provides an efficient and practical methodology for vision-
based object modeling (Section A) and transformation (Section B) which has
broad applications in computer vision, graphics and virtual reality. One useful
application is to create augmented reality (AR) by intelligent model manipulation.

Augmented reality is a technology that first employs computer vision
methodologies to estimate the 3D geometry of real world scenes, and then
uses computer graphics technologies to augment the synthesized models
and to present them in a way similar to that of virtual reality (VR).

To build an augmented scene, virtual objects/scenes have to be combined 
with the real objects/scenes constructed by the process described in Section A. 
With this invention, imaging data from the real world can be sensed and modeled 
by AHR. Similarly, virtual scenes are also modeled by AHR. The AHR's from 
both the real scene and the virtual scene are then integrated into a unified AHR, 
which represents the augmented scenes. 

The term augmentation used herein means not only integrating different 
AHR's, but also supplementing the AHR's with elements that may appear neither 
in the real scenes nor in the virtual scenes. With AHR, object properties such as 
surface colors and textures are represented as attributes, which makes it easy 
for us to perform the integrating and supplementing tasks required by most AR 
applications. Figure 12 gives a simple illustration on how AHR can be applied in 
the construction of augmented reality. 

The procedure of constructing augmented reality from a real scene and a 
virtual scene is: 

1 . constructing the AHR of the real scene using the method given by Section A; 

2. constructing the AHR of the virtual scene using a computer graphics tool; 

3. computing the sequence of transformation operators between said two 
attributed hypergraphs using the method given in Section 8; 

4. integrating the two AHR's corresponding to the real scene and the virtual 
scene respectively into a unified AHR that represents the augmented scene 
by applying the sequence of transformation operators computed in step 3; 
and 

5. reconstructing the augmented scene from the AHR obtained in step 4. 

The core of this procedure is performing the augmentation by AHR 
integration (step 4), which is achieved by applying a proper set of AH operators. 
Typical augmentations include: 

• Integration with virtual scene: 

A typical AR world normally consists of the objects constructed from the real 
3D scene and additional objects or backgrounds imported from the virtual 
world. One of the most important problems in the integration procedure is the 
proper alignment of the virtual parts and the real parts. 

• Change of surface colors and/or textures: 

The objects in the AR world may change their surface colors and/or textures 
to present certain effects (for example, the shadow and the color changes 
due to the virtual light sources). 

• Shape changes by deformations: 

When the AR world is presented for physical simulation, it is possible that the 
objects, either from real views or from a virtual world, have shape changes 
due to the physical interactions among them. 

In Definition 12 and Definition 13, we have defined the operators ∪ (union) and 
∩ (intersection) on two attributed hypergraphs, which can be directly applied for 
the task of AHR integration. Suppose that the AHR of the real scene is Gr and 
the AHR of the virtual part is Gv; the combined AHR G is obtained by: 

1. calculate Gu = Gr ∪ Gv and Gi = Gr ∩ Gv; 

2. let Go = Gu \ Gi; 

3. align the attributes of all entities in Gi; 

4. set G = Go ∪ Gi. 
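By way of illustration only, the four integration steps above can be sketched in Python. The sketch assumes a deliberately simplified AHR in which each entity (vertex or hyperedge) is a dictionary key mapped to its attributes, and it assumes that union and intersection act on entity identifiers; Definitions 12 and 13 may differ in detail. The names `integrate`, `align` and `mean_position` are hypothetical and do not appear elsewhere in this description.

```python
# Simplified AHR: {entity_id: {attribute_name: value}}. This is an assumed
# stand-in for the attributed hypergraph, not the full data structure.

def union(a, b):
    """Entities of either graph; on a clash the attributes of `a` are kept."""
    return {**b, **a}

def intersection(a, b):
    """Entities present in both graphs (attributes taken from `a`)."""
    return {k: a[k] for k in a if k in b}

def difference(a, b):
    """Entities of `a` that are not in `b`."""
    return {k: v for k, v in a.items() if k not in b}

def integrate(g_r, g_v, align):
    """Combine the real-scene AHR g_r with the virtual AHR g_v."""
    g_u = union(g_r, g_v)                      # step 1: Gu = Gr U Gv
    g_i = intersection(g_r, g_v)               # step 1: Gi = Gr n Gv
    g_o = difference(g_u, g_i)                 # step 2: Go = Gu \ Gi
    g_i = {k: align(g_r[k], g_v[k]) for k in g_i}  # step 3: align shared entities
    return union(g_o, g_i)                     # step 4: G = Go U Gi

def mean_position(attrs_r, attrs_v):
    """Hypothetical alignment rule: average the two positions."""
    pos = tuple((p + q) / 2 for p, q in zip(attrs_r["pos"], attrs_v["pos"]))
    return {**attrs_v, **attrs_r, "pos": pos}

g_r = {"v1": {"pos": (0.0, 0.0, 0.0)}, "v2": {"pos": (1.0, 0.0, 0.0)}}
g_v = {"v2": {"pos": (2.0, 0.0, 0.0)}, "v3": {"pos": (3.0, 0.0, 0.0)}}
g = integrate(g_r, g_v, mean_position)
```

The alignment rule is passed in as a function because, as noted in the text, the alignment depends upon the application context.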

The alignment of the attributes of two AHR's depends upon the application 
context. Normally, it only involves translations and/or rotations on the AHR's to 
properly position the objects in the AR scene and thus is handled by one or more 
attribute transitions. However, in some cases, there may be qualitative changes 
of the AHR's associated with the integration. In the task of object fusion, which 
may be required by AR tasks, the integration of the two AHR's that represent the 
two objects, denoted as Gr and Gv, has to be followed by eliminating all 
redundancies (e.g., inner surfaces or inner vertices) to ensure a fully integrated 
shape. The following steps will remove the inner vertices and surfaces: 

1. calculate G, which is the integration of Gr and Gv (with possible inner vertices 
and/or surfaces), and Gi = Gr ∩ Gv; 

2. if a vertex vk ∈ Gi, vk is an inner vertex: 

(a) find a vertex vi ∈ G such that vi is adjacent to vk and the discrepancy 
between vi's attribute value Avi and vk's attribute value Avk is the smallest 
among all vertices that are adjacent to vk; 

(b) in G, apply merge on vk and vi; 

3. if a hyperedge Hk ∈ Gi, Hk becomes an inner hyperedge: 

(a) find a hyperedge Hi ∈ G such that Hi is adjacent to Hk and the 
discrepancy between Hi's attribute value Aei and Hk's attribute value Aek is 
the smallest among all hyperedges adjacent to Hk; 

(b) in G, apply join on Hk and Hi; 

4. if there is no inner vertex or hyperedge, exit; otherwise go to step 2. 
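As a non-limiting sketch, the vertex part of the above procedure (steps 2 and 4) may be written in Python as follows. The sketch assumes that the elementary graph is held as an adjacency dictionary, that each vertex attribute is a numeric vector, and that the discrepancy is the Euclidean distance between attribute vectors. The function `merge` is a simplified stand-in for the merge operator of the AHR operator set, and all names are hypothetical. Inner hyperedges would be handled analogously with join.

```python
import math

def discrepancy(a, b):
    """Assumed discrepancy measure: Euclidean distance between attribute vectors."""
    return math.dist(a, b)

def merge(adj, attrs, vk, vi):
    """Simplified merge: fold vertex vk into its neighbour vi and drop vk."""
    for n in adj.pop(vk):
        adj[n].discard(vk)
        if n != vi:
            # Former neighbours of vk become neighbours of vi.
            adj[n].add(vi)
            adj[vi].add(n)
    del attrs[vk]

def remove_inner_vertices(adj, attrs, inner):
    """Merge every inner vertex into its closest adjacent vertex (steps 2 and 4)."""
    pending = {v for v in inner if v in adj}
    while pending:                      # step 4: repeat until none is left
        vk = pending.pop()
        if not adj[vk]:                 # isolated inner vertex: simply remove it
            del adj[vk], attrs[vk]
            continue
        # step 2(a): adjacent vertex with the smallest attribute discrepancy
        vi = min(adj[vk], key=lambda n: discrepancy(attrs[n], attrs[vk]))
        merge(adj, attrs, vk, vi)       # step 2(b)
    return adj, attrs

# Vertex "k" lies in the intersection and is closest to "b".
adj = {"a": {"k"}, "b": {"k"}, "c": {"k"}, "k": {"a", "b", "c"}}
attrs = {"a": (0.0, 0.0, 0.0), "b": (1.0, 0.0, 0.0),
         "c": (5.0, 0.0, 0.0), "k": (0.9, 0.0, 0.0)}
remove_inner_vertices(adj, attrs, inner={"k"})
```

In this example the surviving vertex keeps its own attribute value; whether the merged vertex should instead take a blended value is left to the merge operator actually used.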

Another important procedure in performing the augmentation is object 
reconstruction from AHR. Since the AHR has the geometrical information stored 
on its elementary graph, which is constructed from the triangular mesh model 
through RAG's, the reconstruction of the object shape resembles the 
reconstruction of an object's shape by triangular mesh approximations. Other 
properties of the object, such as surface color and texture, can be extracted from 
the attributes associated with the AHR and applied to the reconstructed 
triangular mesh. 

Natural looking surface textures (or surface colors when texture is not 
considered) are of great importance to the construction of augmented reality. To 
meet this requirement, a texture or color binding algorithm is adopted to 
estimate the surface appearance for each triangle from the available image 
views. 

Figure 13 illustrates the principle of binding surface appearance information. 
With calibrated camera parameters, the projective relation between the 3D 
shape and their image views can be established in the form of a set of 
equations. Then for each triangular surface in the reconstructed model, the 
image view with the largest viewing area of the triangle is selected. The texture 
and/or color of the projected triangle in the selected image view is inverted into 
3D modeling space and clipped on the corresponding triangle. The surface 
texture (or color) binding algorithm has the following modules: 

1. Texture (or color) region grouping: 

Since the object is represented by an AHR based on a triangular mesh, 
smoothly connected triangles with homogeneous textures (or colors) in their 
image views are grouped together to reduce the computational cost. Local 
geometrical features are also considered as constraints to make sure that the 
triangles in the same texture (color) group are smoothly connected. A closed 
convex region that has invariant normal vector is preferred for binding to 
reduce the total length of boundary. 

2. Image projection selection: 

In most cases, a grouped region is visible in more than one of the images. It 
is reasonable to assume that the greater the exposure of the texture region in 
an image, the more accuracy will be achieved when the region is inverted into 
a model space. Therefore, the image with the largest viewing area of the 
grouped texture region is selected and a projective inversion is performed 
with the camera (or any other sensor used for capturing the images) pose 
associated with this image. 

3. Boundary processing: 

Special care has to be taken along the boundaries between different texture 
regions. A simple averaging filter is added for the local areas around the 
boundaries to blend the texture. Although this results in a local blurring effect, 
it is more visually acceptable than the abrupt change without the filter, which 
is called the blocking effect in computer graphics. 

4. Texture synthesis for occluded region: 

It is possible that the available views do not cover all the surfaces of the 
object/scene. For color attributes, a simple duplication of the color from the 
nearby triangles is adequate. In the case that textures are processed, the 
texture of the invisible regions is synthesized from visible regions depending 
on the viewing point. The synthesized textures or duplicated colors may not 
correspond to the real ones but are still less conspicuous than the 
texture/color holes. 
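The image projection selection of module 2 may be illustrated, for a single triangle, by the following Python sketch. It assumes that each calibrated view is described by a 3x4 pinhole projection matrix and that the triangle lies in front of every camera; occlusion and back-face tests are omitted for brevity. The names `project`, `projected_area` and `select_view` are hypothetical.

```python
def project(P, X):
    """Project the 3D point X into the image using the 3x4 camera matrix P."""
    Xh = (*X, 1.0)  # homogeneous coordinates
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)

def projected_area(P, triangle):
    """Area of the triangle's projection in the image of camera P."""
    (x0, y0), (x1, y1), (x2, y2) = (project(P, v) for v in triangle)
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2.0

def select_view(cameras, triangle):
    """Index of the view in which the triangle has the largest viewing area."""
    return max(range(len(cameras)),
               key=lambda i: projected_area(cameras[i], triangle))

# A camera placed at x = -5 looking along +x (sees the triangle obliquely),
# and a camera at the origin looking along +z (sees the triangle face on).
side_camera = [[0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0, 5.0]]
front_camera = [[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]]
triangle = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)]
best = select_view([side_camera, front_camera], triangle)
```

For a grouped texture region, the same comparison could be made on the sum of the projected areas of the triangles in the group.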

The problem of shape deformation arises when the objects/scenes are 
modeled in AHR with physical parameters as attribute values. Physical features 
as attributes in AHR describe deformable objects by (1) the physical properties 
of the objects; (2) the physical state transitions; (3) the initial states; and (4) 
external factors such as the external forces. 

In an augmented scene, the augmentation may include the changes of the 
physical states for the existing objects or the virtual objects, and possibly with 
additional external forces. In the above cases, due to the physical interactions 
among the objects, there may be deformations. Therefore, a recalculation on the 
physical states of the objects is required. Further detail of the calculations on 
shape deformations by AHR operators has been given in Section B. 

D. 3D Web Streaming based on AHR 

Referring again to Figure 1, the AHR method disclosed herein provides not 
only practical modeling and transformation methods but also an efficient data 
compression method for 3D data. A typical application of such data compression 
is 3D Web content streaming. AHR-based 3D Web streaming methodology 
utilizes the method given in Section A to model 3D Web content. It transmits the 
symbolic representation of the Web content instead of the Web content itself. 
Hence, it significantly improves the speed of content transmission over the 
Internet with limited bandwidth. The procedure of 3D Web streaming is: 

1. at the server side, given a piece of dynamic 3D Web content, extract the 
features and represent by AHR Gs; 

2. construct the AHR Gd corresponding to the dynamic portion of the Web 
content; 

3. extract the AHR operator sequence OPd as directed in Section B by 
comparing Gd with its corresponding portion in Gs; 

4. transmit the AHR operator sequence OPd over the Internet; 

5. at the client side (on a client computer), given the AHR Gs and its 
corresponding operator sequence OPd, apply the operators to the AHR and 
generate a sequence of AHR's, which represent the dynamic Web content; 

6. perform object reconstruction for each of the AHR's to form the dynamic Web 
content. 
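Steps 3 to 5 of the streaming procedure may be sketched in Python as follows, purely as an illustration. The sketch assumes a simplified AHR held as a dictionary of entity attributes and assumes that only attribute-transition operators are needed; the operator extraction of Section B is more general. The JSON record layout and the names `extract_operators` and `apply_operators` are hypothetical.

```python
import copy
import json

def extract_operators(g_s, g_d):
    """Server side (step 3): list the attribute changes that turn the
    corresponding portion of g_s into g_d."""
    ops = []
    for entity, attrs in g_d.items():
        for name, value in attrs.items():
            if g_s.get(entity, {}).get(name) != value:
                ops.append({"op": "attribute_transition",
                            "entity": entity, "attr": name, "value": value})
    return ops

def apply_operators(g_s, ops):
    """Client side (step 5): apply the operators in order and yield one AHR
    per operator, leaving g_s itself unchanged."""
    g = copy.deepcopy(g_s)
    for op in ops:
        g.setdefault(op["entity"], {})[op["attr"]] = op["value"]
        yield copy.deepcopy(g)

g_s = {"v1": {"pos": [0, 0, 0], "color": "red"},
       "v2": {"pos": [1, 0, 0], "color": "red"}}
g_d = {"v2": {"pos": [1, 1, 0], "color": "red"}}   # only v2 moves

payload = json.dumps(extract_operators(g_s, g_d))  # step 4: what is transmitted
frames = list(apply_operators(g_s, json.loads(payload)))
```

Only the operator records are serialized for transmission, which reflects the statement above that the symbolic representation is transmitted instead of the Web content itself.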

The foregoing description of the preferred embodiments of the invention has 
been presented to illustrate the principles of the invention and not to limit the 
invention to the particular embodiment illustrated. It is intended that the scope of the 
invention be defined by all of the embodiments encompassed within the following 
claims and their equivalents. 